A genome-wide survey of human pseudogenes.
Abstract
We screened all intergenic regions in the human genome to identify pseudogenes with a combination of homology searches and a functionality test using the ratio of replacement to silent nucleotide substitutions (KA/KS). We identified 19,724 regions, of which 95% ± 3% are estimated to evolve neutrally and thus are likely to encode pseudogenes. Half of these have no detectable truncation in their pseudocoding regions and therefore are not identifiable by methods that require the presence of truncations to prove nonfunctionality. A comparative analysis with the mouse genome showed that 70% of these pseudogenes have a retrotranspositional origin (processed), and the rest arose by segmental duplication (nonprocessed). Although the spread of both types of pseudogenes correlates with chromosome size, nonprocessed pseudogenes appear to be enriched in regions with high gene density. It is likely that the human pseudogenes identified here represent only a small fraction of the total, which probably exceeds the number of genes.
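As a rough illustration of the kind of KA/KS functionality test the abstract mentions, the sketch below computes the ratio from pre-counted substitutions using the Jukes-Cantor correction, as in the Nei-Gojobori approach. It is not the authors' pipeline: the function names, the input counts, and the `tolerance` cutoff are all assumptions made for this example. The idea is that a ratio near 1 is consistent with neutral evolution, while a ratio well below 1 suggests purifying selection on a functional sequence.

```python
import math


def jukes_cantor(p):
    """Correct a proportion of differing sites for multiple substitutions."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion of differences must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def ka_ks(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return KA/KS from counts of differences and sites (hypothetical inputs)."""
    ka = jukes_cantor(nonsyn_diffs / nonsyn_sites)  # replacement substitutions per site
    ks = jukes_cantor(syn_diffs / syn_sites)        # silent substitutions per site
    if ks == 0:
        raise ValueError("KS is zero, so the ratio is undefined")
    return ka / ks


def looks_neutral(ratio, tolerance=0.5):
    """Crude check: is the ratio within an arbitrary tolerance of 1?

    The tolerance is an illustrative assumption, not a published threshold;
    a real analysis would use a statistical test instead.
    """
    return abs(ratio - 1.0) <= tolerance


# Equal rates at both site classes: consistent with neutral evolution.
neutral_ratio = ka_ks(10, 100, 10, 100)
# Far fewer replacement than silent changes: consistent with constraint.
constrained_ratio = ka_ks(2, 200, 20, 100)
print(looks_neutral(neutral_ratio), looks_neutral(constrained_ratio))
```

Counting the synonymous and nonsynonymous sites and differences from a codon alignment is the harder part and is deliberately left out here.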
Similar resources
Genome-Wide Survey for Biologically Functional Pseudogenes
According to current estimates there exist about 20,000 pseudogenes in a mammalian genome. The vast majority of these are disabled and nonfunctional copies of protein-coding genes which, therefore, evolve neutrally. Recent findings that a Makorin1 pseudogene, residing on mouse Chromosome 5, is, indeed, in vivo vital and also evolutionarily preserved, encouraged us to conduct a genome-wide surve...
HOPPSIGEN: a database of human and mouse processed pseudogenes
Processed pseudogenes result from reverse transcribed mRNAs. In general, because processed pseudogenes lack promoters, they are no longer functional from the moment they are inserted into the genome. Subsequently, they freely accumulate substitutions, insertions and deletions. Moreover, the ancestral structure of processed pseudogenes could be easily inferred using the sequence of their functio...
Pseudogenes in metazoa: origin and features.
The complete genome sequences with their annotations are a considerable resource in biology, particularly in understanding the global structure of the genetic material at the molecular level. The reason why some eukaryotic genomes contain large quantities of apparently unnecessary DNA, namely pseudogenes, while others seem to invest in more efficient thinning processes or are equipped with prot...
dreamBase: DNA modification, RNA regulation and protein binding of expressed pseudogenes in human health and disease
Although thousands of pseudogenes have been annotated in the human genome, their transcriptional regulation, expression profiles and functional mechanisms are largely unknown. In this study, we developed dreamBase (http://rna.sysu.edu.cn/dreamBase) to facilitate the investigation of DNA modification, RNA regulation and protein binding of potential expressed pseudogenes from multidimensional hig...
SNPs on human chromosomes 21 and 22 -- analysis in terms of protein features and pseudogenes.
SNPs are useful for genome-wide mapping and the study of disease genes. Previous studies have focused on SNPs in specific genes or SNPs pooled from a variety of different sources. Here, a systematic approach to the analysis of SNPs in relation to various features on a genome-wide scale, with emphasis on protein features and pseudogenes, is presented. We have performed a comprehensive analysis o...
Journal:
- Genome research
Volume 13, Issue 12
Pages: -
Publication date: 2003